In [1]:
__nbid__ = '0010'
__author__ = 'Robert Nikutta <robert.nikutta@noirlab.edu>, Stéphanie Juneau <stephanie.juneau@noirlab.edu>, Knut Olsen <knut.olsen@noirlab.edu>, David Nidever <david.nidever@noirlab.edu>'
__version__ = '20240603' # yyyymmdd
__datasets__ = ['smash_dr1']
__keywords__ = ['science example', 'dwarf galaxies']

Detecting the Hydra II dwarf galaxy in SMASH DR1¶

Robert Nikutta, Stéphanie Juneau, Knut Olsen, David Nidever, & Astro Data Lab Team

Table of contents¶

  • Goals & notebook summary
  • Disclaimer & attribution
  • Imports & setup
  • Query the SMASH DR1 catalog for blue stars (expected in metal-poor dwarfs)
  • Spatial density plot
  • Detect overdensities (differential Gaussian kernels)
  • Identify peaks
  • Inspect the image cutouts around the peaks (Simple Image Access)
  • CM diagrams of all stars around the peaks
  • Save your results to a local file
  • Resources

Goals¶

We will use a dwarf galaxy detection algorithm based on spatial overdensities of blue stars to rediscover the Hydra II dwarf galaxy in the SMASH DR1 dataset. Hydra II (red circle in the map below) is one of ~60 dwarf galaxy satellites of the Milky Way.

[Figure: map showing the location of Hydra II (red circle)]

Summary¶

Background Ultrafaint dwarf galaxies are crucial to understanding many aspects of the universe. For instance, they are dominated by dark matter, so their locations can trace the large-scale structure of the dark matter distribution. Furthermore, dwarf galaxies are suspected to host intermediate-mass black holes (IMBHs), which have so far eluded detection. IMBHs would naturally bridge the gap between solar-mass black holes and the supermassive black holes that reside at the center of virtually every large galaxy.

Data retrieval We will retrieve data from Field 169 of the SMASH catalog (Nidever et al. 2017, AJ, 154, 199) and look for overdensities of blue objects.

The required columns are RA, Dec, and the g, r, i magnitudes.

Detection We will convolve the spatial distribution of our dataset with a pair of Gaussian kernels and subtract the results, as done in e.g. Stanford et al. (2005, ApJ, 634, 2, L129) (galaxy clusters), or Koposov et al. (2008, ApJ, 686, 279) (MW satellites). This has the effect of convolving the spatial distribution with a Mexican hat filter, which is useful for detecting objects at a desired spatial scale.
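As a rough illustration of why subtracting two Gaussian convolutions behaves like a Mexican hat filter, the sketch below builds a one-dimensional difference-of-Gaussians profile in plain Python. The function names and the kernel widths are assumptions chosen for illustration; they are not the values or the code used later in this notebook.

```python
import math

def gaussian_1d(x, sigma):
    """Normalized 1D Gaussian evaluated at offset x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def difference_of_gaussians(x, sigma_small, sigma_big):
    """Narrow Gaussian minus broad Gaussian at offset x."""
    return gaussian_1d(x, sigma_small) - gaussian_1d(x, sigma_big)

# Illustrative widths in pixels (assumed values, not the notebook's kernels)
sigma_small, sigma_big = 1.0, 8.0
offsets = range(-40, 41)
profile = [difference_of_gaussians(x, sigma_small, sigma_big) for x in offsets]

center = profile[40]      # value at x = 0: positive core
wing = profile[40 + 5]    # value at x = 5: negative surrounding trough
total = sum(profile)      # close to zero, so a flat background is suppressed

print(center > 0, wing < 0, abs(total) < 1e-3)
```

The positive core responds to structure at the scale of the small kernel, while the negative wings subtract the smoother background measured at the scale of the big kernel.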

Disclaimer & attribution¶

Disclaimers¶

Note that using the Astro Data Lab constitutes your agreement with our minimal Disclaimers.

Acknowledgments¶

If you use Astro Data Lab in your published research, please include the following text in your paper's Acknowledgments section:

This research uses services or data provided by the Astro Data Lab, which is part of the Community Science and Data Center (CSDC) Program of NSF NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the U.S. National Science Foundation.

If you use SPARCL jointly with the Astro Data Lab platform (via JupyterLab, command-line, or web interface) in your published research, please include the text below in your paper's Acknowledgments section:

This research uses services or data provided by the SPectra Analysis and Retrievable Catalog Lab (SPARCL) and the Astro Data Lab, which are both part of the Community Science and Data Center (CSDC) Program of NSF NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the U.S. National Science Foundation.

In either case please cite the following papers:

  • Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, https://doi.org/10.1117/12.2057445

  • Astro Data Lab overview: Nikutta et al., "Data Lab - A Community Science Platform", Astronomy and Computing, 33, 2020, https://doi.org/10.1016/j.ascom.2020.100411

If you are referring to the Data Lab JupyterLab / Jupyter Notebooks, cite:

  • Juneau et al., "Jupyter-Enabled Astrophysical Analysis Using Data-Proximate Computing Platforms", CiSE, 23, 15, 2021, https://doi.org/10.1109/MCSE.2021.3057097

If publishing in an AAS journal, also add the keyword: \facility{Astro Data Lab}

And if you are using SPARCL, please also add \software{SPARCL} and cite:

  • Juneau et al., "SPARCL: SPectra Analysis and Retrievable Catalog Lab", Conference Proceedings for ADASS XXXIII, 2024 https://doi.org/10.48550/arXiv.2401.05576

The NOIRLab Library maintains lists of proper acknowledgments to use when publishing papers using the Lab's facilities, data, or services.

Imports and setup¶

In [2]:
# std lib
import warnings
warnings.filterwarnings('ignore') # to suppress some astropy deprecation warnings

# 3rd party
import numpy as np
from astropy import utils, io, convolution, stats
from astropy.visualization import make_lupton_rgb
from photutils.detection import find_peaks
from pyvo.dal import sia
import pylab as plt
%matplotlib inline

# Data Lab
from dl import authClient as ac, queryClient as qc
from dl.helpers.utils import convert

# plots default setup
plt.rcParams['font.size'] = 14

Query the SMASH DR1 catalog for blue stars (expected in metal-poor dwarfs)¶

We will query the averaged photometry table from the SMASH catalog and select field #169. We'll issue two queries: one that selects blue (-0.4 < g-r < 0.4), star-like (|sharp| < 0.5) objects, and one without those constraints that instead draws a random subsample (random_id < 10), to be used for comparison. For both queries, we will avoid photometry taken only with short exposures (depthflag > 1), exclude objects with fewer than 4 detections in either g or r, and apply a magnitude constraint of 9 < g < 25 to avoid excessive contamination.
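The two queries share most of their constraints, so one way to keep them consistent is to assemble them from a common list of conditions. The sketch below is only an illustration: `build_smash_query` is a hypothetical helper, not part of the Data Lab client, and it leaves out the `random_id` restriction that the notebook's comparison query also applies.

```python
def build_smash_query(field, blue_stars_only=True):
    """Return an SQL string for one SMASH DR1 field (hypothetical helper)."""
    conditions = [
        "fieldid = {:d}".format(field),
        "depthflag > 1",            # no photometry from short exposures only
        "ndetr > 3",                # more than 3 detections in r
        "ndetg > 3",                # more than 3 detections in g
        "gmag BETWEEN 9 AND 25",
    ]
    if blue_stars_only:
        conditions.append("abs(sharp) < 0.5")                   # star-like
        conditions.append("(gmag-rmag) BETWEEN -0.4 AND 0.4")   # blue
    where = "\n   AND ".join(conditions)
    return "SELECT ra,dec,gmag,rmag,imag\nFROM smash_dr1.object\nWHERE " + where

query_blue = build_smash_query(169)
query_plain = build_smash_query(169, blue_stars_only=False)
print(query_blue)
```

Either string could then be passed to `qc.query(sql=...)` in the same way as the hand-written queries.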

Construct the query strings¶

In [3]:
field = 169 # SMASH field number to query
# Create a query of all sources in the Field, for comparison
query_all =\
"""SELECT ra,dec,gmag,rmag,imag
   FROM smash_dr1.object
   WHERE fieldid = '{:d}' AND
         depthflag > 1 AND
         ndetr > 3 AND ndetg > 3 AND
         gmag BETWEEN 9 AND 25 AND random_id<10""".format(field)
In [4]:
# Create the query string; SQL keywords capitalized for clarity
#   depthflag > 1 = no short exposures please
#   ndetr, ndetg > 3 = more than 3 detections in r & g bands
#   abs(sharp) < 0.5 = avoid broad objects
query =\
"""SELECT ra,dec,gmag,rmag,imag
   FROM smash_dr1.object
   WHERE fieldid = '{:d}' AND
         depthflag > 1 AND
         ndetr > 3 AND ndetg > 3 AND
         abs(sharp) < 0.5 AND
         gmag BETWEEN 9 AND 25 AND
         (gmag-rmag) BETWEEN -0.4 AND 0.4""".format(field)

Submit the query¶

Running the query in synchronous mode is very easy.

In [5]:
df_all = qc.query(sql=query_all,fmt='pandas') # fmt='pandas' returns a DataFrame (the default response is a CSV-formatted string)
print("Number of objects:", len(df_all))
print(df_all.head())
Number of objects: 117455
           ra        dec       gmag       rmag       imag
0  184.876934 -32.868654  24.867119  24.545872  24.546236
1  184.892606 -32.874418  24.309147  25.094023  24.796555
2  184.894265 -32.873554  24.608845  24.403374  24.662037
3  184.895268 -32.872408  24.590645  24.494490  24.497213
4  184.896137 -32.871733  24.677746  24.908146  24.395891
In [6]:
R = qc.query(sql=query,fmt='pandas') # fmt='pandas' returns a DataFrame (the default response is a CSV-formatted string)
print("Number of objects:", len(R))
print(R.head())
Number of objects: 104974
           ra        dec       gmag       rmag       imag
0  184.876674 -32.873511  24.746605  24.838743  24.185682
1  184.876606 -32.870861  24.156397  24.068817  23.074945
2  184.875853 -32.867214  24.084047  24.028060  23.630045
3  184.877080 -32.869780  24.482061  24.446104  23.858896
4  184.878492 -32.866905  24.678942  24.714973  24.624266

Spatial density plot¶

You might spot some overdensities already, particularly in the result with blue star-like objects.

In [7]:
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(14,7), sharey=True, constrained_layout=True)
h1 = ax1.hexbin(df_all['ra'], df_all['dec'],gridsize=200)
ax1.set_xlabel('RA')
ax1.set_ylabel('Dec')
ax1.set_title('~100000 random objects')
h2 = ax2.hexbin(R['ra'], R['dec'],gridsize=200)
ax2.set_xlabel('RA')
ax2.set_title('~100000 blue stellar objects')
fig.colorbar(h1,label='number of objects per spatial bin');
[Figure: side-by-side hexbin density maps (RA vs. Dec) of ~100000 random objects and ~100000 blue stellar objects]

Detect overdensities (differential Gaussian kernels)¶

Here we define the dwarf filter as a differential convolution of a two-dimensional image using two Gaussian kernels; this has the effect of convolution with a Mexican hat filter. The default kernel shapes look for objects on the scale of a few arcmin. The output includes a clipped array of the convolved spatial distribution, which we will use for peak detection.
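Two small conversions sit behind the filter: a Gaussian's FWHM has to be turned into a standard deviation, and the RA range has to be scaled by cos(Dec) before the field is divided into roughly one-arcminute pixels. The sketch below shows both in plain Python; the helper names and the field boundaries are made up for illustration.

```python
import math

# Exact factor relating the FWHM and standard deviation of a Gaussian
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))   # about 2.355

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm / FWHM_PER_SIGMA

def grid_shape_arcmin(ra_min, ra_max, dec_min, dec_max):
    """Number of ~1-arcmin pixels spanning a field (nx along RA, ny along Dec)."""
    dec_mid = 0.5 * (dec_min + dec_max)
    # An RA interval shrinks on the sky by cos(Dec)
    width_arcmin = (ra_max - ra_min) * math.cos(math.radians(dec_mid)) * 60.0
    height_arcmin = (dec_max - dec_min) * 60.0
    return round(width_arcmin), round(height_arcmin)

sigma_small = fwhm_to_sigma(2.0)    # small kernel, arcmin
sigma_big = fwhm_to_sigma(20.0)     # big kernel, arcmin

# Made-up field boundaries in degrees, for illustration only
nx, ny = grid_shape_arcmin(184.0, 186.0, -33.0, -31.0)
print(sigma_small, sigma_big, nx, ny)
```

With one-arcminute pixels, a kernel width given in arcminutes can be passed to the kernel constructor directly as a width in pixels.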

In [8]:
def dwarf_filter (ra,dec,fwhm_small=2.0,fwhm_big=20):

    """Differential convolution with 2D Gaussian kernels.
    
       Based on Koposov et al. (2008).
       Code by Ken Mighell and Mike Fitzpatrick.
       Minor edits by RN.
       
       Parameters
       ----------
       ra, dec : float or array
           RA & Dec in degrees.
    
       fwhm_small, fwhm_big : float
           Full-width half maximum sizes of the small and big Gaussian kernels
           to use in convolution, in arcminutes.
    """
    
    x, y = ra, dec

    print("Computing differential convolution .... ")

    # Information about declination (y) [degrees]
    ymean = (y.min() + y.max()) / 2.0
    ydiff_arcmin = (y.max() - y.min()) * 60.0 # convert from degrees to arcmin

    # Information about right ascension (x) [degrees]:
    xdiff = x.max() - x.min() # RA coordinate range [degrees]
    xmean = (x.min() + x.max()) / 2.0

    # convert the RA coordinate range to an on-sky angular width [degrees]:
    xdiff_angular = (x.max() - x.min()) * np.cos(ymean*(np.pi/180.0))

    # convert from degrees to arcmin
    xdiff_angular_arcmin = xdiff_angular * 60.0

    # Get the number of one-arcmin pixels in the X and Y directions:
    nx = np.rint(xdiff_angular_arcmin).astype('int')
    ny = np.rint(ydiff_arcmin).astype('int')

    # Create a two-dimensional histogram of the raw counts:
    Counts, xedges, yedges  = np.histogram2d (x, y, (nx,ny) )
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    raw_hist = np.rot90(Counts).copy() # rotate so rows run along Dec and columns along RA

    # Make the small and big Gaussian kernels; the standard deviation
    # is FWHM/2.35, in units of one-arcmin pixels.
    kernel_small = convolution.Gaussian2DKernel(fwhm_small/2.35,factor=1)
    kernel_big = convolution.Gaussian2DKernel(fwhm_big/2.35,factor=1)

    # Convolve with both kernels and take the difference.
    conv_big = convolution.convolve(raw_hist, kernel_big)
    conv_small = convolution.convolve(raw_hist, kernel_small)
    conv_delta = conv_small - conv_big
    delta = conv_delta.copy()

    # Compute statistics and the floor
    mean = np.mean(delta, dtype='float64')
    sigma = np.std(delta, dtype='float64')
    sigmaRaw = np.std(raw_hist,dtype='float64')
    median = np.median(delta) # not used
    floor = mean

    print('dwarf_filter: mean = {:g}  sigma = {:g} sigmaRaw = {:g}'.format(mean, sigma, sigmaRaw))

    clipped = delta.copy()
    clipped[delta < floor] = floor

    # Return the computed fields.
    return raw_hist, extent, delta, clipped, sigma

Run the dwarf filter¶

We'll use the default convolution kernels of 2 and 20 arcminutes in size.

In [9]:
small_k, big_k = 2., 20.  # kernel sizes in arcminutes
raw, extent, delta, clipped, dsigma = dwarf_filter(R['ra'],R['dec'],fwhm_small=small_k,fwhm_big=big_k)
Computing differential convolution .... 
dwarf_filter: mean = 0.0890555  sigma = 1.79455 sigmaRaw = 5.33591

Plot the convolved 2D histogram¶

In [10]:
fig, ax = plt.subplots(figsize=(7,6))
im = plt.imshow(clipped)
plt.xlabel('pixel')
plt.ylabel('pixel')
plt.colorbar(label='relative spatial density after convolution');
[Figure: convolved, clipped 2D histogram in pixel coordinates]

Some peaks are visible. Let's locate them automatically.

Identify peaks¶

We'll use the photutils package to identify peaks in the clipped filtered image that exceed the sigma-clipped median by more than 10 (in units of the convolved density).
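Conceptually, peak finding of this kind keeps pixels that exceed a threshold and are also the largest value within a small surrounding box. The sketch below is a simplified pure-Python stand-in for that idea, applied to a tiny made-up image; `find_local_peaks` is a hypothetical function and should not be read as the photutils implementation.

```python
def find_local_peaks(image, threshold, box_size=3):
    """Return (x, y, value) for pixels above threshold that are the maximum
    of their box_size x box_size neighbourhood. Simplified stand-in only."""
    half = box_size // 2
    ny, nx = len(image), len(image[0])
    peaks = []
    for y in range(ny):
        for x in range(nx):
            value = image[y][x]
            if value <= threshold:
                continue
            # Collect the neighbourhood, truncated at the image edges
            neighbourhood = [
                image[j][i]
                for j in range(max(0, y - half), min(ny, y + half + 1))
                for i in range(max(0, x - half), min(nx, x + half + 1))
            ]
            if value >= max(neighbourhood):
                peaks.append((x, y, value))
    return peaks

# Tiny made-up image with two bumps; the weaker one falls below the threshold
toy = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0, 0],
    [0, 2, 12, 2, 0, 0],
    [0, 1, 2, 1, 4, 0],
    [0, 0, 0, 0, 0, 0],
]
peaks = find_local_peaks(toy, threshold=10)
print(peaks)
```

Lowering the threshold in this toy example would also pick up the weaker bump, which is the trade-off between completeness and contamination that the threshold controls.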

In [11]:
# find peaks
mean, median, std = stats.sigma_clipped_stats(clipped,sigma=3.0,maxiters=5)    
tbl = find_peaks(clipped,median+10,box_size=small_k*2)

# add ra & dec positions of peaks found
a, b = extent[:2]
xvec = np.arange(a,b,(b-a)/clipped.shape[1])
a, b = extent[2:]
yvec = np.arange(a,b,(b-a)/clipped.shape[0])

tbl['ra'] = xvec[tbl['x_peak']]
tbl['dec'] = yvec[-tbl['y_peak']-1]
print(tbl)
x_peak y_peak     peak_value             ra                 dec        
------ ------ ------------------ ------------------ -------------------
    86     89  11.31152453793508 185.41055953270728 -31.976003447411664
    34    100 11.512626647783886 184.39348041275326  -32.15954180336243
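The cell above converts pixel indices back to sky coordinates. Because the histogram was rotated, image row 0 corresponds to the highest declination, which is why the Dec lookup counts from the end. A minimal sketch of the same mapping, with a hypothetical function name and a made-up extent and image shape, could look like this:

```python
def pixel_to_sky(x_peak, y_peak, extent, shape):
    """Map pixel indices of the rotated histogram to the (ra, dec)
    of the corresponding bin's lower edge."""
    ra_min, ra_max, dec_min, dec_max = extent
    ny, nx = shape                       # rows (Dec), columns (RA)
    ra_step = (ra_max - ra_min) / nx
    dec_step = (dec_max - dec_min) / ny
    ra = ra_min + x_peak * ra_step
    # Row 0 is the top of the image (highest Dec), so count rows from the other end
    dec = dec_min + (ny - 1 - y_peak) * dec_step
    return ra, dec

# Made-up extent (degrees) and image shape, for illustration only
extent_demo = (184.0, 186.0, -33.0, -31.0)
shape_demo = (120, 100)

ra_top, dec_top = pixel_to_sky(50, 0, extent_demo, shape_demo)
ra_bottom, dec_bottom = pixel_to_sky(0, 119, extent_demo, shape_demo)
print(ra_top, dec_top, ra_bottom, dec_bottom)
```

The returned position is a bin edge rather than a bin center, so it can be off by up to one pixel (about one arcminute here), which is small compared with the cutout and search radii used below.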

Show the identified density peaks¶

In [12]:
ecs = ['w','y'] # colors of box frames
ax.scatter(tbl['x_peak'],tbl['y_peak'],marker='s',s=tbl['peak_value']*40,c='none',edgecolors=ecs,lw=3) # keeps writing to previous ax
fig  # repeats (the updated) figure
Out[12]:
[Figure: convolved histogram with the two density peaks marked by white and yellow boxes]

Inspect the image cutouts around the peaks¶

Simple Image Access service¶

Data Lab comes with batteries included. Image cutout and download services are built in.

We'll just write two little functions:

  • one to download the deepest stacked images found in the given bands at a given position in the sky
  • and a function to plot several images side-by-side.
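The selection logic of the first function can be illustrated without any network access. The sketch below picks, for a given band, the stacked image with the longest exposure time from a list of records. The records and URLs are invented for illustration; only the field names (`proctype`, `prodtype`, `obs_bandpass`, `exptime`, `access_url`) follow the ones used in the cell below, and `pick_deepest` is a hypothetical helper.

```python
def pick_deepest(records, band):
    """Return the stacked-image record with the longest exposure in one band,
    or None if no record matches."""
    candidates = [
        rec for rec in records
        if rec["proctype"] == "Stack"
        and rec["prodtype"] == "image"
        and rec["obs_bandpass"] == band
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda rec: float(rec["exptime"]))

# Invented records, for illustration only
records = [
    {"proctype": "Stack", "prodtype": "image", "obs_bandpass": "g",
     "exptime": "300", "access_url": "https://example.org/g_shallow.fits"},
    {"proctype": "Stack", "prodtype": "image", "obs_bandpass": "g",
     "exptime": "1200", "access_url": "https://example.org/g_deep.fits"},
    {"proctype": "Stack", "prodtype": "wtmap", "obs_bandpass": "g",
     "exptime": "5000", "access_url": "https://example.org/g_weights.fits"},
    {"proctype": "Stack", "prodtype": "image", "obs_bandpass": "r",
     "exptime": "900", "access_url": "https://example.org/r_deep.fits"},
]

deepest_g = pick_deepest(records, "g")
print(deepest_g["access_url"])
```

Returning `None` for a band with no matching image is a design choice of this sketch; it lets the caller decide whether to skip the band or stop.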
In [13]:
# set up SIA
DEF_ACCESS_URL = "https://datalab.noirlab.edu/sia/coadd_all"
svc = sia.SIAService(DEF_ACCESS_URL)

# a little func to download the deepest stacked images
def download_deepest_images(ra,dec,fov=0.1,bands=list('gri')):
    imgTable = svc.search((ra,dec), (fov/np.cos(dec*np.pi/180), fov), verbosity=2).to_table()
    print("The full image list contains {:d} entries.".format(len(imgTable)))
    sel0 = (imgTable['proctype'] == 'Stack') & (imgTable['prodtype'] == 'image') # basic selection
    images = []
    for band in bands:
        print("Band {:s}: ".format(band))
        sel = sel0 & (imgTable['obs_bandpass'] == band) # add 'band' to selection
        Table = imgTable[sel] # select
        row = Table[np.argmax(Table['exptime'].data.data.astype('float'))] # pick image with longest exposure time
        url = row['access_url'] # get the download URL
        print('downloading deepest stacked image...')
        img = io.fits.getdata(utils.data.download_file(url,cache=True,show_progress=False,timeout=120)) # download (cached) and read the image array from the FITS file
        images.append(img)
        
    print("Downloaded {:d} images.".format(len(images)))
    return images

# multi panel image plotter
def plot_images(images,titles=list('gri'),cmap=plt.cm.gray_r):
    fig = plt.figure(figsize=(8,10),tight_layout=True)
    grid = plt.GridSpec(4, 3)
    ax1 = fig.add_subplot(grid[0, 0])
    ax2 = fig.add_subplot(grid[0, 1])
    ax3 = fig.add_subplot(grid[0, 2])
    ax4 = fig.add_subplot(grid[1:, 0:])

    for j,ax,img in zip(np.arange(4),[ax1,ax2,ax3,ax4],images):
        ax.imshow(img,origin='lower',interpolation='none',cmap=cmap,norm=plt.mpl.colors.PowerNorm(0.1))
        ax.axis('off')
        ax.set_title('{:s}'.format(titles[j]))
    

Get images for the "left yellow" box¶

Download the deepest stacked image cutouts (in 3 bands) around the position of the peak marked with a yellow box. Depending on network speed and system load, this can take a few seconds. Also create a 3-band false-color composite.

In [14]:
bands = list('gri')
idx = 1
print(tbl['ra'][idx], tbl['dec'][idx])
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx], fov=0.1, bands=bands) # FOV in deg
184.39348041275326 -32.15954180336243
The full image list contains 401 entries.
Band g: 
downloading deepest stacked image...
Band r: 
downloading deepest stacked image...
Band i: 
downloading deepest stacked image...
Downloaded 3 images.

Plot the images, plus a false-color 3-band image:

In [15]:
images = [im-np.median(im) for im in images] # subtract median from all images for better scaling
images += [make_lupton_rgb(*images[::-1],stretch=30)] # add a 3-color composite image
plot_images(images,titles=bands+['3-band image'])
[Figure: g, r, and i cutouts plus the 3-band composite around the yellow-box peak]

Looks like a galaxy cluster!

Now the "white center box" object¶

In [16]:
idx = 0
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx], fov=0.1, bands=bands) # FOV in deg
images = [im-np.median(im) for im in images] # subtract median from all images for better scaling
images += [make_lupton_rgb(*images[::-1],stretch=30)] # add a 3-color composite image
plot_images(images,titles=bands+['3-band image'])
The full image list contains 403 entries.
Band g: 
downloading deepest stacked image...
Band r: 
downloading deepest stacked image...
Band i: 
downloading deepest stacked image...
Downloaded 3 images.
[Figure: g, r, and i cutouts plus the 3-band composite around the white-box peak]

Definitely not a galaxy cluster, but a stellar overdensity.

Query the database for stars near peaks¶

Get photometry of stars within a 5 arcmin radius of each density peak.
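A cone search keeps every object whose great-circle distance from the center is no larger than the radius. The database does this efficiently with a spatial index; the brute-force sketch below only illustrates the geometric test, using the haversine formula in plain Python. The function names and test positions are made up for illustration.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin(0.5 * (dec2 - dec1))
    sin_dra = math.sin(0.5 * (ra2 - ra1))
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(math.sqrt(h)))

def within_radius(rows, ra0, dec0, radius_deg=5.0 / 60.0):
    """Keep (ra, dec) rows no farther than radius_deg from (ra0, dec0)."""
    return [row for row in rows
            if angular_separation_deg(row[0], row[1], ra0, dec0) <= radius_deg]

ra0, dec0 = 185.41, -31.98   # rounded position of one density peak
rows = [
    (185.41, -31.93),   # 3 arcmin north: inside
    (185.41, -31.88),   # 6 arcmin north: outside
    (185.50, -31.98),   # 0.09 deg in RA, but only ~4.6 arcmin on the sky: inside
]
selected = within_radius(rows, ra0, dec0)
print(selected)
```

The third point shows why a simple box cut on RA and Dec differences would be wrong away from the equator: the RA offset has to be scaled by cos(Dec).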

In [17]:
def makequery(ra0,dec0,radius0=5./60.,field=169):
    query = """
SELECT ra,dec,gmag,rmag,imag FROM smash_dr1.object
  WHERE fieldid = {:d}
    AND depthflag > 1
    AND abs(sharp) < 0.5
    AND gmag BETWEEN 9 AND 25
    AND q3c_radial_query(ra,dec,{:f},{:f},{:f})
""".format(field,ra0,dec0,radius0)

    return query
In [18]:
query0 = makequery(tbl['ra'][0],tbl['dec'][0]) # center ra & dec
R0 = qc.query(sql=query0, fmt='pandas') # using sql argument instead of the default adql
print(R0.head()) # a Pandas method
           ra        dec       gmag       rmag       imag
0  185.341390 -32.034610  18.762146  17.609804  17.174831
1  185.340696 -32.033947  24.666931        NaN        NaN
2  185.352840 -32.038977  24.814121  24.343794  23.955757
3  185.345295 -32.033874  24.745580  24.538908  24.439932
4  185.348514 -32.033831  24.887840  24.834295  24.795055

Compute color¶

In [19]:
R0['g_r'] = R0['gmag'] - R0['rmag']
print( R0.head())
           ra        dec       gmag       rmag       imag       g_r
0  185.341390 -32.034610  18.762146  17.609804  17.174831  1.152342
1  185.340696 -32.033947  24.666931        NaN        NaN       NaN
2  185.352840 -32.038977  24.814121  24.343794  23.955757  0.470327
3  185.345295 -32.033874  24.745580  24.538908  24.439932  0.206672
4  185.348514 -32.033831  24.887840  24.834295  24.795055  0.053545

Now the other peak¶

In [20]:
query1 = makequery(tbl['ra'][1],tbl['dec'][1])
R1 = qc.query(sql=query1,fmt='pandas')
R1['g_r'] = R1['gmag'] - R1['rmag'] # compute color
print(R1.tail()) # another Pandas method
              ra        dec       gmag       rmag       imag       g_r
2170  184.471635 -32.119314  24.889196  24.793587  24.467518  0.095609
2171  184.481076 -32.122957  22.749125  21.331306  20.392641  1.417819
2172  184.470299 -32.112379  23.647750  22.085228  21.719313  1.562522
2173  184.470477 -32.112378  23.919018  22.351788  21.632957  1.567230
2174  184.467250 -32.106465  24.091139  23.340645  22.943100  0.750494

Color-magnitude diagrams of the peaks¶

In [21]:
def plotpanel(axid,x,y,title='',xlim=(-1,2),ylim=(25.2,14)):
    ax = fig.add_subplot(axid)
    ax.scatter(x,y,marker='.',s=10, alpha=0.8)
    ax.set_xlabel(x.name)
    ax.set_ylabel(y.name)
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    ax.set_title(title)
In [22]:
fig = plt.figure(figsize=(12,5.5))
plotpanel(121,R0['g_r'],R0['gmag'],'white box, {:d} objects'.format(len(R0)))
plotpanel(122,R1['g_r'],R1['gmag'],'yellow box, {:d} objects'.format(len(R1)))
[Figure: color-magnitude diagrams (gmag vs. g_r) for the white-box and yellow-box regions]

Compare w/ Vivas+2016¶

[Figure: comparison figure from Vivas et al. (2016)]

Save your photometry table¶

If you wish, you can now save your photometry table for Hydra II to a local file and take it with you.

In [23]:
outfile = 'hydra2.csv'
R0.to_csv(outfile,index=False)

hydra2.csv <-- Download the file as you please.

Some ideas for improvement¶

  • Find overdensities elsewhere (everywhere!)

  • Optimize kernel sizes and computation; use proper wavelets

Some resources¶

Nidever et al. (2017) "SMASH - Survey of the MAgellanic Stellar History": http://adsabs.harvard.edu/abs/2017AJ....154..199N

Stanford et al. (2005, ApJ, 634, 2, L129) "An IR-selected Galaxy Cluster at z = 1.41": http://adsabs.harvard.edu/abs/2005ApJ...634L.129S

Koposov et al. (2008, ApJ, 686, 279) "The Luminosity Function of the Milky Way Satellites": http://adsabs.harvard.edu/abs/2008ApJ...686..279K

Martin et al. (2015, ApJ, 804, L5) "Hydra II: A Faint and Compact Milky Way Dwarf Galaxy Found in the Survey of the Magellanic Stellar History": http://adsabs.harvard.edu/abs/2015ApJ...804L...5M